Cross Validation Added #407

prakrutisingh24 · 2021-05-13T12:10:27Z

No description provided.

jaidevd · 2021-05-13T13:05:30Z

gramex/handlers/mlhandler.py

@@ -20,6 +20,9 @@
 from slugify import slugify
 from tornado.gen import coroutine
 from tornado.web import HTTPError
+from sklearn.metrics import get_scorer
+from sklearn.model_selection import cross_val_predict, cross_val_score
+from sklearn.model_selection import cross_val_predict, cross_val_score


This line appears twice.

This extra line is unnecessary.

jaidevd · 2021-05-13T13:06:43Z

gramex/handlers/mlhandler.py

            # train the model
            target = data[target_col]
            train = data[[c for c in data if c != target_col]]
+            # cross validation
+            mod = cls.modelFunction()


This is not required. The model is already present as cls.model, see line no: 116.

jaidevd · 2021-05-13T13:08:13Z

gramex/handlers/mlhandler.py

+            # cross validation
+            mod = cls.modelFunction()
+            CVscore = cross_val_score(mod, train, target)    
+            CV = sum(CVscore)/len(CVscore)


Use CVscore.mean()

Variable names have to follow the project's style. Run pip install flake8, then run flake8 against this file (flake8 mlhandler.py) and check the output.
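
A minimal sketch of the first suggestion, assuming a placeholder dataset and estimator (neither is what MLHandler actually uses): cross_val_score returns a NumPy array with one score per fold, so the average is available directly through .mean(). The snake_case names are an assumption about the intended style.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Placeholder data and model, for illustration only.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1000)

# One score per fold, returned as a NumPy array.
cv_score = cross_val_score(model, X, y)

# Equivalent to sum(cv_score) / len(cv_score), but shorter.
cv_mean = cv_score.mean()
```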

jaidevd · 2021-05-13T13:13:34Z

@prakrutisingh24 In this PR, we are just computing the cross val score when the model is set up for the first time, and simply printing the CV score. What we need is:

- When any scoring happens, especially if it is triggered by the user through ?_action=train or ?_action=score, the result should be a cross-validated score.
- By default, cross validation should happen, but users should be able to turn it off through gramex.yaml and get scores on the entire training dataset.
- sklearn's cross_val_score has a parameter, cv, that can be set in many different ways. Check the docs for details. MLHandler users should be able to exercise all those options.

Thanks,
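
To illustrate the last point, here is a rough sketch of a few values the cv parameter of cross_val_score accepts. The data and estimator are placeholders, and the 5-fold default for cv=None applies to recent scikit-learn releases (0.22 and later).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Placeholder data and model, for illustration only.
X, y = make_classification(n_samples=120, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1000)

# cv left at its default (None): 5 folds in recent scikit-learn releases.
scores_default = cross_val_score(model, X, y)

# cv as an integer: that many folds.
scores_int = cross_val_score(model, X, y, cv=3)

# cv as a splitter object: full control over shuffling and seeding.
splitter = KFold(n_splits=4, shuffle=True, random_state=0)
scores_splitter = cross_val_score(model, X, y, cv=splitter)
```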

jaidevd · 2021-05-19T10:32:35Z

gramex/handlers/mlhandler.py

@@ -40,6 +44,8 @@
    'nums': [],
    'cats': [],
    'target_col': None,
+    'CV': True,
+    'CVargs': []


Let's have a single argument, cv, which can take any value, i.e. in gramex.yaml, users should be able to write any of the following.

cv: false       # disable cross validation

cv: 5           # Use 5 folds

cv:
  cv: 8         # Use 8 folds
  n_jobs: -1    # with an optional other parameter.
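
One possible way to handle these three shapes, sketched under the assumption that the YAML value has already been parsed into a Python object. The helper name cv_kwargs is hypothetical and not part of gramex.

```python
def cv_kwargs(cv):
    """Translate a parsed `cv` config value into keyword arguments
    for cross_val_score. Returns None when cross validation is disabled.

    Hypothetical helper, for illustration only.
    """
    # `cv: false` disables cross validation. The identity check keeps
    # None (the default) from being treated as "disabled".
    if cv is False:
        return None
    # Nested form: `cv: {cv: 8, n_jobs: -1}` is passed through as-is.
    if isinstance(cv, dict):
        return dict(cv)
    # Integer, None, or any other value sklearn accepts for `cv`.
    return {'cv': cv}
```

The result can then be unpacked into the call, e.g. cross_val_score(model, X=train, y=target, **kwargs), whenever it is not None.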

jaidevd · 2021-05-19T10:32:46Z

gramex/handlers/mlhandler.py

@@ -20,6 +20,9 @@
 from slugify import slugify
 from tornado.gen import coroutine
 from tornado.web import HTTPError
+from sklearn.metrics import get_scorer
+from sklearn.model_selection import cross_val_predict, cross_val_score
+from sklearn.model_selection import cross_val_predict, cross_val_score


This extra line is unnecessary.

jaidevd · 2021-05-19T10:36:24Z

gramex/handlers/mlhandler.py

+            # cross validation
+            print('yayyy we are here')
+            cls.CrossValidation(train,target)
+            print('should have printed')


Please remove the prints.

jaidevd

The training is happening in def _fit. Cross validation should also happen there.

jaidevd · 2021-05-25T05:39:43Z

gramex/handlers/mlhandler.py

@@ -20,6 +20,9 @@
 from slugify import slugify
 from tornado.gen import coroutine
 from tornado.web import HTTPError
+from sklearn.metrics import get_scorer
+from sklearn.model_selection import cross_val_predict, cross_val_score
+from ast import literal_eval


This should not be required.

jaidevd · 2021-05-25T05:39:50Z

gramex/handlers/mlhandler.py

@@ -40,6 +43,7 @@
    'nums': [],
    'cats': [],
    'target_col': None,
+    'CV': True,


Make it lowercase.

We have to support three cases for the cv option:

- If the user sets cv: false, then no cross validation happens.

- If the user sets cv: 4 (or some other integer), pass it straight to cross_val_score.

- The default should be cv: None, and in this case, the user should not have to write anything in gramex.yaml.
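
The three cases could be dispatched roughly as follows. This is a sketch, not the MLHandler implementation: fit_and_score is a hypothetical name, the data and estimator are placeholders, and refitting on the full data after cross validation is an assumption. The check is `cv is False` rather than a truthiness test, because the default None must still trigger cross validation.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def fit_and_score(model, train, target, cv=None):
    """Fit `model` and return a score (hypothetical helper)."""
    if cv is False:
        # Cross validation disabled: score on the entire training set.
        model.fit(train, target)
        return model.score(train, target)
    # cv is None (sklearn's default folds) or an integer / splitter.
    score = cross_val_score(model, train, target, cv=cv).mean()
    # cross_val_score works on clones, so fit the real model afterwards
    # (an assumption about the desired behaviour).
    model.fit(train, target)
    return score


# Placeholder data and model, for illustration only.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
clf = LogisticRegression(max_iter=1000)
score_off = fit_and_score(clf, X, y, cv=False)
score_int = fit_and_score(clf, X, y, cv=4)
score_default = fit_and_score(clf, X, y)
```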

jaidevd · 2021-05-25T05:40:54Z

gramex/handlers/mlhandler.py

            # train the model
            target = data[target_col]
            train = data[[c for c in data if c != target_col]]
+            # cross validation
+            cls.CrossValidation(train,target)


Make it lowercase.

jaidevd · 2021-05-25T05:41:26Z

gramex/handlers/mlhandler.py

+        mclass = model_kwargs.get('class', False)
+        if mclass:
+            model = search_modelclass(mclass)(**model_kwargs.get('params', {}))
+            return model


This function is not required.

jaidevd · 2021-05-25T05:41:42Z

gramex/handlers/mlhandler.py

+        if CV:
+            CVscore = cross_val_score(mod, X=train, y=target, **literal_eval(json.dumps(CV)))
+            CVavg = sum(CVscore)/len(CVscore)
+            print('Cross Validation Score : ',CVavg)


CV should take place within the train method only.

if cv:
    cvscore = cross_val_score(mod, X=train, y=target, cv=cv)
else:
    # Do the usual .fit

jaidevd · 2021-06-02T04:06:04Z

gramex/handlers/mlhandler.py

            # train the model
            target = data[target_col]
            train = data[[c for c in data if c != target_col]]
+            # cross validation
+            cls.cross_validation(train,target)


Not required here.

Prakruti Singh added 8 commits April 21, 2021 15:27

new scoring metrics added

cdc0e5a

new changes

e737117

made the requested changes

64e6c22

removed unnecessary files and print statements + fixed _predict()

babc4bf

removed empty lines and spaces + fixed if..else indentation

b1683b8

new changes

ddd5d73

added Cross Validation results

4b672cd

cross validation

1063207

jaidevd reviewed May 13, 2021

requested changes made

09b49f1

jaidevd requested changes May 19, 2021

single input for CV

5f24173

jaidevd requested changes May 25, 2021

final suggested changes done

eaa58dc

jaidevd reviewed Jun 2, 2021

Prakruti Singh added 2 commits June 3, 2021 11:18

Merge branch 'master' of https://github.com/gramener/gramex

e175cfe

resolved conflicts

5f631c1
